    Individual Fairness in Pipelines

    It is well understood that a system built from individually fair components may not itself be individually fair. In this work, we investigate individual fairness under pipeline composition. Pipelines differ from ordinary sequential or repeated composition in that individuals may drop out at any stage, and classification in subsequent stages may depend on the remaining "cohort" of individuals. As an example, a company might hire a team for a new project and at a later point promote the highest performer on the team. Unlike other repeated classification settings, where the degree of unfairness degrades gracefully over multiple fair steps, the degree of unfairness in pipelines can be arbitrary, even in a pipeline with just two stages. Guided by a panoply of real-world examples, we provide a rigorous framework for evaluating different types of fairness guarantees for pipelines. We show that naïve auditing is unable to uncover systematic unfairness and that, in order to ensure fairness, some form of dependence must exist between the design of algorithms at different stages in the pipeline. Finally, we provide constructions that permit flexibility at later stages, meaning that there is no need to lock in the entire pipeline at the time that the early stage is constructed.
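
    The cohort effect described above is easy to make concrete. The minimal Python sketch below shows a two-stage pipeline in which a fixed individual's stage-two outcome depends on who else survived stage one; the `hire_stage`/`promote_stage` functions and the random stand-in scores are hypothetical illustrations, not a construction from the paper.

```python
import random

# Two-stage pipeline in the spirit of the hire-then-promote example:
# stage 1 selects a cohort and everyone else drops out; stage 2's
# decision for any one person depends on who else survived stage 1.

def hire_stage(candidates, hire_score, threshold=0.5):
    """Stage 1: keep the cohort of candidates whose score clears the threshold."""
    return [c for c in candidates if hire_score(c) >= threshold]

def promote_stage(cohort, performance):
    """Stage 2: promote only the top performer of the surviving cohort."""
    if not cohort:
        return set()
    return {max(cohort, key=performance)}

# Hypothetical stand-ins for the two stage classifiers.
random.seed(0)
scores = {c: random.random() for c in ["a", "b", "c", "d"]}
perf = {c: random.random() for c in ["a", "b", "c", "d"]}

cohort = hire_stage(scores, scores.get)      # iterating the dict yields candidate names
promoted = promote_stage(cohort, perf.get)
print(cohort, promoted)
```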

    Efficient Algorithms for Privately Releasing Marginals via Convex Relaxations

    Consider a database of $n$ people, each represented by a bit-string of length $d$ corresponding to the setting of $d$ binary attributes. A $k$-way marginal query is specified by a subset $S$ of $k$ attributes and an $|S|$-dimensional binary vector $\beta$ specifying their values. The result for this query is a count of the number of people in the database whose attribute vector restricted to $S$ agrees with $\beta$. Privately releasing approximate answers to a set of $k$-way marginal queries is one of the most important and well-motivated problems in differential privacy. Information theoretically, the error complexity of marginal queries is well understood: the per-query additive error is known to be at least $\Omega(\min\{\sqrt{n}, d^{\frac{k}{2}}\})$ and at most $\tilde{O}(\min\{\sqrt{n}\,d^{1/4}, d^{\frac{k}{2}}\})$. However, no polynomial-time algorithm with error complexity as low as the information-theoretic upper bound is known for small $n$. In this work we present a polynomial-time algorithm that, for any distribution on marginal queries, achieves average error at most $\tilde{O}(\sqrt{n}\,d^{\frac{\lceil k/2 \rceil}{4}})$. This error bound is as good as the best known information-theoretic upper bounds for $k = 2$. This bound is an improvement over previous work on efficiently releasing marginals when $k$ is small and when error $o(n)$ is desirable. Using private boosting we are also able to give nearly matching worst-case error bounds. Our algorithms are based on the geometric techniques of Nikolov, Talwar, and Zhang. The main new ingredients are convex relaxations and careful use of the Frank-Wolfe algorithm for constrained convex minimization. To design our relaxations, we rely on the Grothendieck inequality from functional analysis.
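
    To make the query class concrete, here is a minimal sketch that answers a single $k$-way marginal query exactly, plus a naive per-query Laplace-noised answer as a baseline. The function names and the noise calibration are illustrative assumptions; this is not the convex-relaxation algorithm of the paper.

```python
import numpy as np

def marginal_query(data, S, beta):
    """Exact k-way marginal: how many rows of `data` agree with `beta` on subset S.

    data : (n, d) 0/1 array, one row per person
    S    : list of k attribute indices
    beta : length-k 0/1 vector of target values
    """
    return int(np.sum(np.all(data[:, S] == np.asarray(beta), axis=1)))

def noisy_marginal(data, S, beta, epsilon, rng=None):
    """Naive per-query epsilon-DP answer: the count plus Laplace(1/epsilon) noise
    (a single counting query has sensitivity 1). A baseline, not the paper's algorithm."""
    rng = rng or np.random.default_rng()
    return marginal_query(data, S, beta) + rng.laplace(scale=1.0 / epsilon)

# Hypothetical example: n = 1000 people, d = 8 binary attributes, one 2-way query.
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(1000, 8))
print(marginal_query(data, S=[1, 4], beta=[1, 0]))
print(noisy_marginal(data, S=[1, 4], beta=[1, 0], epsilon=0.5, rng=rng))
```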

    Improved Generalization Guarantees in Restricted Data Models

    Differential privacy is known to protect against threats to validity incurred due to adaptive, or exploratory, data analysis -- even when the analyst adversarially searches for a statistical estimate that diverges from the true value of the quantity of interest on the underlying population. The cost of this protection is the accuracy loss incurred by differential privacy. In this work, inspired by standard models in the genomics literature, we consider data models in which individuals are represented by a sequence of attributes with the property that distant attributes are only weakly correlated. We show that, under this assumption, it is possible to "re-use" privacy budget on different portions of the data, significantly improving accuracy without increasing the risk of overfitting.
    Comment: 13 pages, published in FORC 2022.
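
    As a rough illustration of the budget re-use idea only, the sketch below answers one counting query per attribute block while spending the same Laplace noise scale on every block rather than splitting the budget across them. Whether such re-use is safe is precisely what rests on the weak-correlation assumption, so the block structure and calibration here are assumptions, not the paper's mechanism or analysis.

```python
import numpy as np

def blockwise_noisy_counts(data, blocks, epsilon, rng=None):
    """Answer one counting query per attribute block, spending the same Laplace
    noise scale 1/epsilon on every block instead of splitting the budget.

    data   : (n, d) 0/1 array of individuals' attributes
    blocks : disjoint lists of attribute indices, assumed only weakly correlated
             with one another (the restricted data model's assumption)
    """
    rng = rng or np.random.default_rng()
    answers = []
    for block in blocks:
        # Illustrative query: how many individuals have every attribute in this block set to 1?
        count = int(np.sum(np.all(data[:, block] == 1, axis=1)))
        answers.append(count + rng.laplace(scale=1.0 / epsilon))
    return answers

# Hypothetical usage with four blocks of two attributes each.
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(500, 8))
print(blockwise_noisy_counts(data, [[0, 1], [2, 3], [4, 5], [6, 7]], epsilon=0.5, rng=rng))
```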

    Abstracting Fairness: Oracles, Metrics, and Interpretability

    It is well understood that classification algorithms, for example, for deciding on loan applications, cannot be evaluated for fairness without taking context into account. We examine what can be learned from a fairness oracle equipped with an underlying understanding of "true" fairness. The oracle takes as input a (context, classifier) pair satisfying an arbitrary fairness definition, and accepts or rejects the pair according to whether the classifier satisfies the underlying fairness truth. Our principal conceptual result is an extraction procedure that learns the underlying truth; moreover, the procedure can learn an approximation to this truth given access to a weak form of the oracle. Since every "truly fair" classifier induces a coarse metric, in which those receiving the same decision are at distance zero from one another and those receiving different decisions are at distance one, this extraction process provides the basis for ensuring a rough form of metric fairness, also known as individual fairness. Our principal technical result is a higher-fidelity extractor under a mild technical constraint on the weak oracle's conception of fairness. Our framework permits the scenario in which many classifiers, with differing outcomes, may all be considered fair. Our results have implications for interpretability -- a highly desired but poorly defined property of classification systems that endeavors to permit a human arbiter to reject classifiers deemed to be "unfair" or illegitimately derived.
    Comment: 17 pages, 1 figure.
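
    The coarse metric induced by a classifier is straightforward to write down: two individuals are at distance zero exactly when they receive the same decision, and at distance one otherwise. The sketch below is just that definition (the threshold classifier is a hypothetical example), not the extraction procedure itself.

```python
def coarse_metric(f):
    """Metric induced by classifier f: distance 0 iff f gives the same decision."""
    def d(x, y):
        return 0 if f(x) == f(y) else 1
    return d

# Hypothetical usage: a threshold classifier on a single score.
f = lambda score: int(score >= 0.7)
d = coarse_metric(f)
assert d(0.8, 0.9) == 0   # same decision -> distance 0
assert d(0.8, 0.3) == 1   # different decisions -> distance 1
```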

    Proving Differential Privacy with Shadow Execution

    Recent work on formal verification of differential privacy shows a trend toward usability and expressiveness -- generating a correctness proof of a sophisticated algorithm while minimizing the annotation burden on programmers. Sometimes, combining those two requires substantial changes to program logics: one recent paper is able to verify Report Noisy Max automatically, but it involves a complex verification system using customized program logics and verifiers. In this paper, we propose a new proof technique, called shadow execution, and embed it into a language called ShadowDP. ShadowDP uses shadow execution to generate proofs of differential privacy with very few programmer annotations and without relying on customized logics and verifiers. In addition to verifying Report Noisy Max, we show that it can verify a new variant of Sparse Vector that reports the gap between some noisy query answers and the noisy threshold. Moreover, ShadowDP reduces the complexity of verification: for all of the algorithms we have evaluated, type checking and verification in total take at most 3 seconds, while prior work takes minutes on the same algorithms.
    Comment: 23 pages, 12 figures, PLDI'19.
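
    For context on the Sparse Vector variant mentioned above, the sketch below illustrates its input/output behavior: compare noisy query answers to a noisy threshold and, when an answer clears it, report the gap rather than a bare "above". The noise scales and budget split are the textbook Sparse Vector choices and should be read as assumptions; ShadowDP itself is the verification technique, not this code.

```python
import numpy as np

def sparse_vector_with_gap(query_answers, threshold, epsilon, c=1, rng=None):
    """Behavioral sketch of a gap-reporting Sparse Vector variant.

    query_answers : exact answers to sensitivity-1 queries, in order
    threshold     : public threshold
    epsilon       : total budget; the textbook noise scales used below
                    (2/epsilon for the threshold, 4c/epsilon for queries) are assumptions
    c             : stop after this many above-threshold reports
    Returns None for below-threshold queries and the noisy gap otherwise.
    """
    rng = rng or np.random.default_rng()
    noisy_threshold = threshold + rng.laplace(scale=2.0 / epsilon)
    out, reported = [], 0
    for answer in query_answers:
        noisy_answer = answer + rng.laplace(scale=4.0 * c / epsilon)
        if noisy_answer >= noisy_threshold:
            out.append(noisy_answer - noisy_threshold)  # report the gap, not just "above"
            reported += 1
            if reported >= c:
                break
        else:
            out.append(None)
    return out
```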